105 research outputs found

    Co-production of contrastive prosodic focus and manual gestures: temporal coordination and effects on the acoustic and articulatory correlates of focus

    Speech, and prosody in particular, is tightly linked to manual gestures. This study investigates the coordination of prosodic contrastive focus with different types of manual gestures (pointing, beat, and control gestures). We used motion capture on ten speakers to explore this issue. The results show that prosodic focus "attracts" the manual gesture whatever its type; the temporal alignment is stricter for pointing and is mainly realized between the apex of the pointing gesture and articulatory vocalic targets. Moreover, the production of a gesture, whatever its type, does not affect the acoustic and articulatory correlates of prosodic focus.
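    By way of illustration, the sketch below shows one common way to quantify this kind of coordination: locating the apex of the pointing gesture and a vocalic articulatory target in synchronized motion-capture traces, then measuring the lag between them. The signal shapes, sampling rate, and landmark definitions are assumptions made for the example, not the authors' actual pipeline.

```python
import numpy as np

def apex_time(trajectory: np.ndarray, fs: float) -> float:
    """Time (s) of the gesture apex, taken here as the peak of the
    fingertip trajectory (a common operational definition)."""
    return float(np.argmax(trajectory)) / fs

def vocalic_target_time(lip_aperture: np.ndarray, fs: float) -> float:
    """Time (s) of the vocalic target, approximated here as the point
    of maximal lip aperture within the focused syllable."""
    return float(np.argmax(lip_aperture)) / fs

# Hypothetical 120 Hz motion-capture traces for one utterance.
fs = 120.0
t = np.arange(0.0, 2.0, 1.0 / fs)
finger = np.exp(-((t - 0.90) ** 2) / 0.02)  # pointing apex near 0.90 s
lips = np.exp(-((t - 0.95) ** 2) / 0.01)    # vocalic target near 0.95 s

lag_ms = 1000 * (vocalic_target_time(lips, fs) - apex_time(finger, fs))
print(f"apex-to-vowel lag: {lag_ms:.0f} ms")
```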

    An EMG study of the lip muscles during covert auditory verbal hallucinations in schizophrenia

    Purpose: Auditory verbal hallucinations (AVHs) are speech perceptions in the absence of any external stimulation. An influential theoretical account of AVHs in schizophrenia claims that a deficit in inner speech monitoring causes the verbal thoughts of the patient to be perceived as external voices. The account is based on a predictive control model in which verbal self-monitoring is implemented. The aim of this study was to examine lip muscle activity during AVHs in schizophrenia patients, in order to check whether inner speech occurred. Methods: Lip muscle activity was recorded during covert AVHs (without articulation) and at rest. Surface electromyography (EMG) was used with eleven schizophrenia patients. Results: Our results show an increase in EMG activity in the orbicularis oris inferior muscle during covert AVHs relative to rest. This increase is not due to general muscular tension, since there was no increase of muscular activity in the forearm muscle. Conclusion: This evidence that AVHs might be self-generated inner speech is discussed in the framework of a predictive control model. Further work is needed to better describe how the inner speech monitoring dysfunction occurs and how inner speech is controlled and monitored. This will help us better understand how AVHs occur.
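    A minimal sketch of the kind of comparison such a study involves: computing the RMS amplitude of surface-EMG epochs per patient and condition, then running a paired non-parametric test. The epoch length, sampling rate, and simulated amplitudes are assumptions for the example; this is not the study's analysis code.

```python
import numpy as np
from scipy.stats import wilcoxon

def rms(epoch: np.ndarray) -> float:
    """Root-mean-square amplitude of a baseline-corrected EMG epoch."""
    return float(np.sqrt(np.mean(epoch ** 2)))

rng = np.random.default_rng(0)
n_patients, fs = 11, 1000

# Hypothetical 2 s lip-EMG epochs per patient: covert AVH vs. rest.
epochs_avh = rng.normal(0.0, 0.14, (n_patients, 2 * fs))
epochs_rest = rng.normal(0.0, 0.10, (n_patients, 2 * fs))

rms_avh = np.array([rms(e) for e in epochs_avh])
rms_rest = np.array([rms(e) for e in epochs_rest])

# Paired non-parametric test, a typical choice for an n = 11 sample.
stat, p = wilcoxon(rms_avh, rms_rest)
print(f"Wilcoxon T = {stat:.1f}, p = {p:.4f}")
```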

    Recognizing prosody from the lips: is it possible to extract prosodic focus from lip features?

    The aim of this chapter is to examine the possibility of extracting prosodic information from lip features. We used two measurement techniques enabling automatic lip feature extraction to evaluate the "lip pattern" of prosodic focus in French. Two corpora with Subject-Verb-Object (SVO) sentences were designed. Four focus conditions (S, V, O, or neutral) were elicited in a natural dialogue situation. In a first set of experiments, we recorded two speakers of French with front and profile video cameras; the speakers wore blue make-up and facial markers. In a second set, we recorded five speakers with a 3D optical tracker. An analysis of the lip features showed that visible articulatory lip correlates of focus exist for all speakers. Two types of patterns were observed: absolute and differential. A potential outcome of this study is to provide criteria for the automatic visual detection of prosodic focus from lip data.
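    To make the two pattern types concrete, here is a toy classifier based on one plausible operationalization of the two strategies: an "absolute" pattern, in which the focused constituent is hyperarticulated relative to its own neutral realization, and a "differential" one, in which it is enhanced relative to the other constituents of the utterance. The feature, units, and the 1.2 threshold are invented for the example.

```python
def focus_pattern(lip_open: dict[str, float], neutral: dict[str, float],
                  focused: str, abs_gain: float = 1.2) -> str:
    """Classify a speaker's visible focus strategy (illustrative rule)."""
    gains = {c: lip_open[c] / neutral[c] for c in lip_open}
    others = [g for c, g in gains.items() if c != focused]
    if gains[focused] >= abs_gain:
        return "absolute"
    if gains[focused] > max(others):
        return "differential"
    return "no visible pattern"

# Hypothetical peak lip-opening values (mm) per constituent (S, V, O),
# under S-focus and in the neutral condition.
print(focus_pattern({"S": 13.0, "V": 9.0, "O": 10.0},
                    {"S": 10.0, "V": 9.0, "O": 10.0}, focused="S"))
# -> "absolute"
```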

    Auditory-Visual Perception of VCVs Produced by People with Down Syndrome: Preliminary Results

    Down Syndrome (DS) is a genetic disease involving a number of anatomical, physiological, and cognitive impairments. In particular, it affects speech production abilities, which results in reduced intelligibility; this intelligibility loss has, however, only been evaluated auditorily. Yet many studies have demonstrated that adding vision to audition helps the perception of speech produced by people without impairments, especially when the signal is degraded, as is the case in noise. The present study examines whether visual information improves the intelligibility of people with DS. Twenty-four participants without DS were presented with VCV (vowel-consonant-vowel) sequences produced by four adults (two with DS and two without). These stimuli were presented in noise in three modalities: auditory, auditory-visual, and visual. The results confirm the reduced auditory intelligibility of speakers with DS. They also show that, for the speakers involved in this study, visual intelligibility is equivalent to that of speakers without DS and compensates for the loss in auditory intelligibility. An analysis of the perceptual errors shows that most of them involve confusions between consonants. These results highlight the crucial role of multimodality in improving the intelligibility of people with DS.
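    As an illustration of the error analysis mentioned above, the following sketch scores VCV responses into a consonant confusion matrix and derives a percent-correct intelligibility measure for one modality. The consonant inventory and response data are made up.

```python
import numpy as np

CONSONANTS = ["p", "t", "k", "b", "d", "g"]

def confusion_matrix(stimuli: list[str], responses: list[str]) -> np.ndarray:
    """Rows = consonant produced, columns = consonant reported."""
    idx = {c: i for i, c in enumerate(CONSONANTS)}
    m = np.zeros((len(CONSONANTS), len(CONSONANTS)), dtype=int)
    for s, r in zip(stimuli, responses):
        m[idx[s], idx[r]] += 1
    return m

# Hypothetical responses to VCV stimuli in one modality.
stimuli = ["p", "t", "k", "b", "d", "g", "p", "t"]
responses = ["p", "t", "g", "b", "d", "g", "b", "t"]

m = confusion_matrix(stimuli, responses)
intelligibility = np.trace(m) / m.sum()  # proportion correct
print(f"percent correct: {100 * intelligibility:.0f}%")  # 75% here
```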

    Perception audio-visuelle de séquences VCV produites par des personnes porteuses de Trisomie 21 : une étude préliminaire

    The speech of people with Down Syndrome (DS) is systematically altered, resulting in a loss of intelligibility that has so far been quantified only auditorily. The visual modality could actually improve intelligibility, as it does for "ordinary" people. The present study compares the way 24 ordinary participants perceive VCV (vowel-consonant-vowel) sequences produced by four adults (two with DS and two ordinary) and presented in noise in three modalities: auditory, auditory-visual, and visual. The results confirm an intelligibility loss in the auditory modality for the speakers with DS. However, for the two speakers involved in this study, visual intelligibility is equivalent to that of the ordinary speakers and compensates for the loss in auditory intelligibility. These results put forward the importance of integrating multimodality to improve the intelligibility of people with DS.

    Multimodal Perception of Prosodic Contrastive Focus in French: A Preliminary fMRI Study

    http://www.zas.gwz-berlin.de/events/summerschool_2007/index.htm
    Contrastive focus is used to emphasize a word or group of words in an utterance as opposed to another. In French, it can be conveyed by prosody, using a specific intonational contour on the constituent pointed at (XXX_F a mangé la pomme. 'XXX_F ate the apple.'). It remains unclear what neural processes underlie the perception of prosodic focus. Meanwhile, studies have shown that prosodic processing in general cannot be restricted to the right hemisphere (see [1] for a review). Moreover, it appears [2] that even though the perception of prosodic focus has often been considered purely auditory, it is possible to perceive prosodic focus visually, and the visual modality can enhance perception when prosodic auditory cues are degraded (whispered speech). This finding emphasizes the need to consider the perception of prosodic contrastive focus, and speech prosody in general, as multimodal. The aim of this study is to analyze the neural processing of prosodic focus from a multimodal point of view.

    Left-Dominant Temporal-Frontal Hypercoupling in Schizophrenia Patients With Hallucinations During Speech Perception

    Background: Task-based functional neuroimaging studies of schizophrenia have not yet replicated the increased coordinated hyperactivity in speech-related brain regions that is reported in symptom-capture and resting-state studies of hallucinations. This may be due to suboptimal selection of cognitive tasks. Methods: In the current study, we used a task that allowed experimental manipulation of control over verbal material and compared brain activity between 23 schizophrenia patients (10 hallucinators, 13 nonhallucinators), 22 psychiatric (bipolar) controls, and 27 healthy controls. Two conditions were presented: one involving inner verbal thought (in which control over verbal material was required) and another involving speech perception (SP; in which control over verbal material was not required). Results: A functional connectivity analysis yielded a left-dominant temporal-frontal network that included speech-related auditory and motor regions and showed hypercoupling in past-week hallucinating schizophrenia patients (relative to nonhallucinating patients) during SP only. Conclusions: These findings replicate our previous work showing generalized speech-related functional network hypercoupling in schizophrenia during inner verbal thought and SP, but extend it by suggesting that hypercoupling is related to past-week hallucination severity scores during SP only, when control over verbal material is not required. This result opens the possibility that practicing control over inner verbal thought processes may decrease the likelihood or severity of hallucinations.
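    For readers unfamiliar with the method, the sketch below shows the simplest form of functional connectivity analysis: pairwise correlation of regional time series. The region labels and simulated data are placeholders; the study's actual pipeline is more elaborate.

```python
import numpy as np

def connectivity(ts: np.ndarray) -> np.ndarray:
    """Functional connectivity as pairwise Pearson correlations
    between regional time series (regions x timepoints)."""
    return np.corrcoef(ts)

rng = np.random.default_rng(1)
regions = ["L STG", "L IFG", "L precentral"]  # speech-related network
ts = rng.normal(size=(len(regions), 200))     # hypothetical BOLD series
ts[1] += 0.8 * ts[0]                          # induce temporal-frontal coupling

fc = connectivity(ts)
print(f"{regions[0]}-{regions[1]} coupling: r = {fc[0, 1]:.2f}")
```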

    Coordination of mouth and hand actions in speech communication

    Starting from an evolutionary perspective on the vocal vs. gestural origins of language, we shall propose a view in which the coordination between mouth and hand is given a key role in speech communication. We shall present a framework called “Vocalize to Localize”, in which deixis is considered a possible bootstrap for this coordination in both evolution and development. We shall then present a range of data, from both infants and adults, describing hand-mouth coordination in deixis in a quantitative way. Finally, we shall describe a framework for studying mouth-hand coordination on a larger scale, proposing experimental paradigms concerned with the production and perception of prosodic focus.

    Deixis prosodique multisensorielle : production et perception audiovisuelle de la focalisation contrastive en français

    The work described in this dissertation is grounded in three major findings. First, numerous researchers have shown that speech is not only auditory but also visual. Second, prosody, i.e. intonation, rhythm, and phrasing, plays a key role in speech. Third, deixis is a core phenomenon in spoken communication and in its acquisition by infants. Deixis can be achieved using speech: it is indeed possible to "show with the voice", using prosodic focus for example. These observations enable us to assume that prosodic contrastive focus is rooted not only in audition, as has already been widely explored, but also in vision. The various works presented in this dissertation explore this hypothesis for French. Several production studies, analyzing the recordings of six speakers with two different and complementary measurement techniques, have shown that focus is signaled visually. Speakers use two different strategies regarding the visible articulatory movements: an absolute strategy and a differential one. The measurements have also shown that other, non-articulatory facial gestures, such as eyebrow and head movements, may be linked to the production of contrastive focus; this link is, however, highly variable both between and within speakers. In parallel, perceptual experiments have enabled us to show that the visual correlates of focus are used to extract focus information when the auditory modality is absent or degraded. It was also shown that the visual correlates identified in the production studies correspond, at least in part, to those used in audiovisual perception. These studies have thus shown that prosodic contrastive focus is "visible" and "seen". The findings allow us to sketch a cognitive model of the audiovisual production and perception of contrastive focus in French.

    Multimodal perception of speech segments and speech prosody in relation to their production

    The aim of this talk is to investigate the multimodal perception of speech in relation to its production. Both segmental and suprasegmental aspects of speech will be considered. We shall first recall how speech sounds and images are fused in speech perception, starting from speech in noise and the McGurk effect; this shows that speech communication is intrinsically multimodal. We shall then discuss possible cognitive architectures for audiovisual fusion, relating them to recent neurocognitive data on perceptuo-motor links in the human brain, from mirror neurons to the cortical dorsal route of speech perception. Finally, we shall present a number of recent results we have obtained on audiovisual prosody, dealing with the audiovisual perception and production of contrastive focus.
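    One classic candidate architecture for the audiovisual fusion discussed here is Massaro's Fuzzy-Logical Model of Perception (FLMP), in which unimodal evidence for each response category is combined multiplicatively and then normalized. A minimal sketch with invented evidence values follows; the talk does not necessarily endorse this particular model.

```python
def flmp(audio: dict[str, float], visual: dict[str, float]) -> dict[str, float]:
    """Fuzzy-Logical Model of Perception: multiply unimodal evidence
    per category, then normalize across categories."""
    raw = {c: audio[c] * visual[c] for c in audio}
    total = sum(raw.values())
    return {c: v / total for c, v in raw.items()}

# Ambiguous audio (/ba/ vs /da/) with vision favoring /da/: the
# configuration that yields McGurk-style fusion percepts.
print(flmp({"ba": 0.5, "da": 0.5}, {"ba": 0.1, "da": 0.9}))
# -> {'ba': 0.1, 'da': 0.9}
```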